Overview

Dataset statistics

Number of variables: 16
Number of observations: 1442
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 170.6 KiB
Average record size in memory: 121.2 B
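
The percentage and per-record figures follow from the raw totals. A minimal sketch in plain Python, assuming the average record size is the total memory divided by the number of observations (the function name `dataset_overview` is illustrative, not part of pandas-profiling):

```python
def dataset_overview(n_variables, n_observations, missing_cells, total_bytes):
    """Derive the percentage and per-record figures from the raw totals."""
    n_cells = n_variables * n_observations
    return {
        "cells": n_cells,
        "missing_pct": 100.0 * missing_cells / n_cells,
        "avg_record_bytes": total_bytes / n_observations,
    }

# Totals as reported; 170.6 KiB is itself rounded, so the per-record size
# can only be recovered approximately.
overview = dataset_overview(16, 1442, 0, 170.6 * 1024)
print(overview)
```

Because the reported total is rounded to one decimal, the result lands close to, but not exactly on, the reported 121.2 B.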

Variable types

DateTime: 1
Categorical: 4
Numeric: 11

Warnings

username has constant value "jack" [Constant]
tweet has a high cardinality: 1438 distinct values [High cardinality]
mentions is highly correlated with number of tweets [High correlation]
video is highly correlated with photos [High correlation]
photos is highly correlated with video [High correlation]
replies_count is highly correlated with retweets_count and 1 other field [High correlation]
retweets_count is highly correlated with replies_count and 1 other field [High correlation]
likes_count is highly correlated with replies_count and 1 other field [High correlation]
number of tweets is highly correlated with mentions [High correlation]
urls is highly correlated with mentions and 1 other field [High correlation]
mentions is highly correlated with urls and 2 other fields [High correlation]
photos is highly correlated with video and 1 other field [High correlation]
bins is highly correlated with percent change [High correlation]
number of tweets is highly correlated with urls and 2 other fields [High correlation]
hashtags is highly correlated with mentions and 2 other fields [High correlation]
percent change is highly correlated with bins [High correlation]
cashtags is highly correlated with username [High correlation]
bins is highly correlated with username [High correlation]
username is highly correlated with cashtags and 1 other field [High correlation]
hashtags is highly skewed (γ1 = 25.37077893) [Skewed]
tweet is uniformly distributed [Uniform]
date has unique values [Unique]
mentions has 992 (68.8%) zeros [Zeros]
hashtags has 1168 (81.0%) zeros [Zeros]
video has 1130 (78.4%) zeros [Zeros]
photos has 1134 (78.6%) zeros [Zeros]
urls has 751 (52.1%) zeros [Zeros]
retweets_count has 42 (2.9%) zeros [Zeros]
percent change has 27 (1.9%) zeros [Zeros]
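
The zero percentages in the warnings can be checked against the 1442 observations. A small sketch, assuming each percentage is simply the zero count divided by the number of rows and rounded to one decimal (the helper `zero_share` is illustrative):

```python
N_OBSERVATIONS = 1442

# Zero counts copied from the warnings above.
zero_counts = {
    "mentions": 992,
    "hashtags": 1168,
    "video": 1130,
    "photos": 1134,
    "urls": 751,
    "retweets_count": 42,
    "percent change": 27,
}

def zero_share(count, total=N_OBSERVATIONS):
    """Return the share of zeros as a percentage rounded to one decimal."""
    return round(100.0 * count / total, 1)

zero_pct = {name: zero_share(count) for name, count in zero_counts.items()}
for name, pct in zero_pct.items():
    print(f"{name}: {pct}%")
```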

Reproduction

Analysis started: 2021-09-27 19:03:15.735864
Analysis finished: 2021-09-27 19:03:40.160606
Duration: 24.42 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
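
The reported duration is the difference between the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2021-09-27 19:03:15.735864")
finished = datetime.fromisoformat("2021-09-27 19:03:40.160606")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```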

Variables

date
Date

UNIQUE

Distinct: 1442
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 11.4 KiB
Minimum: 2016-08-24 16:00:00
Maximum: 2021-07-20 09:30:00
Histogram with fixed size bins (bins=50)
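
A histogram with fixed size bins splits the range between the minimum and maximum into equal-width intervals and counts the values in each. The sketch below is a plain-Python illustration of that idea, not the plotting code used by the report; for a date column the values would first be converted to numbers (for example, timestamps in seconds), and the function name is hypothetical.

```python
def fixed_width_histogram(values, bins):
    """Count values into `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # Degenerate range: everything falls in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum value is placed in the last bin rather than overflowing.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

# Illustrative data only (not taken from the dataset).
sample = [0, 1, 2, 3, 4]
hist = fixed_width_histogram(sample, bins=2)
print(hist)
```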

tweet
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 1438
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 11.4 KiB

Value | Count
This is great | 3
Facts | 2
Good morning | 2
@dangillmor the process is not as clear as it could be. That's on us. We're taking all the feedback to make it better! | 1
https://t.co/wa5c3lv21c | 1
Other values (1433) | 1433

Length

Max length: 11161
Median length: 94
Mean length: 248.8932039
Min length: 1

Characters and Unicode

Total characters: 358904
Distinct characters: 280
Distinct categories: 19
Distinct scripts: 5
Distinct blocks: 16
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1435
Unique (%): 99.5%
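
The report distinguishes "distinct" (number of different values) from "unique" (values that occur exactly once). The sketch below rebuilds a column with the same repetition pattern as the tweet variable to show how the two counts differ; the 1435 single-occurrence strings are hypothetical stand-ins, since the full data is not part of this report.

```python
from collections import Counter

# Repeated tweets as listed in the report; the remaining 1435 rows are
# stand-in strings (hypothetical), each occurring exactly once.
tweets = (
    ["This is great"] * 3
    + ["Facts"] * 2
    + ["Good morning"] * 2
    + [f"placeholder tweet {i}" for i in range(1435)]
)

counts = Counter(tweets)
n_rows = len(tweets)
n_distinct = len(counts)                                # different values
n_unique = sum(1 for c in counts.values() if c == 1)    # values seen once

print(n_rows, n_distinct, n_unique)
```

Under these assumptions the three repeated tweets account for the gap between 1438 distinct and 1435 unique values.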

Sample

1st row: @dangillmor the process is not as clear as it could be. That's on us. We're taking all the feedback to make it better!
2nd row: @Stammy @Square @niw @RuaneFootball @martucci @cjburrows @thecleanmachine @goyal_soniya @yuriyr @pon_brandon @LindaxGuo 💯
3rd row: @Jayanta and happy birthday! 🙌🏼🙌🏼🙌🏼 https://t.co/xbd8inh0Ky Power your TouchBistro or Vend point of sale with Square payments and hardware! https://t.co/VYBWJtSyhR
4th row: 💬💸 https://t.co/fBhEDusY7a
5th row: "Hey Siri, send John $10" https://t.co/0gbGP0jpzd

Common Values

Value | Count | Frequency (%)
This is great | 3 | 0.2%
Facts | 2 | 0.1%
Good morning | 2 | 0.1%
@dangillmor the process is not as clear as it could be. That's on us. We're taking all the feedback to make it better! | 1 | 0.1%
https://t.co/wa5c3lv21c | 1 | 0.1%
NBA Finals: Raptors and Warriors face off in first Finals game in Canada #NBAFinals https://t.co/Rs8IH3jdbx | 1 | 0.1%
@nedsegal @paraga @boo Smart @paraga @boo I know exactly how he feels. | 1 | 0.1%
Welcome to the flock @dantley! Grateful we get to work w you. | 1 | 0.1%
https://t.co/XDwGVqypgZ Bucks at Raptors #MILvsTOR https://t.co/H41L7NzioH https://t.co/2M8Tc6zc65 @nilsfrahm | 1 | 0.1%
What an amazing group! | 1 | 0.1%
Other values (1428) | 1428 | 99.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
to | 1683 | 3.2%
and | 1421 | 2.7%
the | 1364 | 2.6%
a | 848 | 1.6%
we | 763 | 1.5%
of | 747 | 1.4%
you | 744 | 1.4%
for | 654 | 1.3%
this | 551 | 1.1%
is | 510 | 1.0%
Other values (10133) | 42752 | 82.2%
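
The table above lists the most frequent words across all tweets. A rough sketch of how such a table can be produced, assuming tokens are lowercased and split on whitespace (the report does not state its tokenization rule, so this is an assumption, and `word_frequencies` is an illustrative name):

```python
from collections import Counter

def word_frequencies(texts):
    """Count lowercased, whitespace-separated tokens across all texts."""
    counter = Counter()
    for text in texts:
        counter.update(text.lower().split())
    return counter

# Three tweets taken from the Common Values table.
sample_tweets = [
    "This is great",
    "Good morning",
    "Welcome to the flock @dantley! Grateful we get to work w you.",
]
freq = word_frequencies(sample_tweets)
total = sum(freq.values())
for word, count in freq.most_common(3):
    print(word, count, f"{100 * count / total:.1f}%")
```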

Most occurring characters

Value | Count | Frequency (%)
(space) | 52456 | 14.6%
e | 28795 | 8.0%
t | 25478 | 7.1%
o | 21652 | 6.0%
a | 20818 | 5.8%
n | 17490 | 4.9%
i | 17350 | 4.8%
s | 16100 | 4.5%
r | 15898 | 4.4%
h | 11103 | 3.1%
Other values (270) | 131764 | 36.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 257299 | 71.7%
Space Separator | 52460 | 14.6%
Uppercase Letter | 20706 | 5.8%
Other Punctuation | 20239 | 5.6%
Decimal Number | 4351 | 1.2%
Final Punctuation | 1136 | 0.3%
Other Symbol | 863 | 0.2%
Connector Punctuation | 390 | 0.1%
Modifier Symbol | 223 | 0.1%
Close Punctuation | 220 | 0.1%
Other values (9) | 1017 | 0.3%
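
The category names above are Unicode general categories. A minimal sketch of how characters can be grouped this way with Python's `unicodedata` module; the mapping `CATEGORY_NAMES` covers only a few codes and is an illustrative helper, not part of the report's tooling:

```python
import unicodedata
from collections import Counter

# A few two-letter general-category codes and the names used in the report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Sc": "Currency Symbol",
}

def category_counts(text):
    """Count characters in `text` by Unicode general category."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    return {CATEGORY_NAMES.get(code, code): n for code, n in codes.items()}

# Text taken from the 5th sample row (without the URL).
counts = category_counts('"Hey Siri, send John $10"')
print(counts)
```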

Most frequent character per category

Other Symbol
ValueCountFrequency (%)
👏143
16.6%
🙏99
 
11.5%
81
 
9.4%
💯58
 
6.7%
27
 
3.1%
👋27
 
3.1%
👇27
 
3.1%
🇬17
 
2.0%
👌16
 
1.9%
🇳16
 
1.9%
Other values (149)352
40.8%
Lowercase Letter
ValueCountFrequency (%)
e28795
 
11.2%
t25478
 
9.9%
o21652
 
8.4%
a20818
 
8.1%
n17490
 
6.8%
i17350
 
6.7%
s16100
 
6.3%
r15898
 
6.2%
h11103
 
4.3%
l10615
 
4.1%
Other values (19)72000
28.0%
Uppercase Letter
ValueCountFrequency (%)
T2131
 
10.3%
S1494
 
7.2%
I1400
 
6.8%
W1378
 
6.7%
A1310
 
6.3%
C1065
 
5.1%
M928
 
4.5%
B847
 
4.1%
N825
 
4.0%
L766
 
3.7%
Other values (16)8562
41.4%
Other Punctuation
ValueCountFrequency (%)
@5426
26.8%
/4770
23.6%
.4488
22.2%
:1869
 
9.2%
,1306
 
6.5%
!979
 
4.8%
#431
 
2.1%
?344
 
1.7%
'333
 
1.6%
"68
 
0.3%
Other values (9)225
 
1.1%
Decimal Number
ValueCountFrequency (%)
1646
14.8%
0615
14.1%
2530
12.2%
4407
9.4%
3404
9.3%
5394
9.1%
7348
8.0%
9339
7.8%
6335
7.7%
8333
7.7%
Other Letter
ValueCountFrequency (%)
6
54.5%
1
 
9.1%
1
 
9.1%
1
 
9.1%
1
 
9.1%
1
 
9.1%
Math Symbol
ValueCountFrequency (%)
+12
48.0%
~6
24.0%
=5
20.0%
|1
 
4.0%
1
 
4.0%
Modifier Symbol
ValueCountFrequency (%)
🏼195
87.4%
🏻13
 
5.8%
¯12
 
5.4%
🏽3
 
1.3%
Currency Symbol
ValueCountFrequency (%)
$95
97.9%
£1
 
1.0%
1
 
1.0%
Dash Punctuation
ValueCountFrequency (%)
-196
93.3%
13
 
6.2%
1
 
0.5%
Format
ValueCountFrequency (%)
64
44.8%
63
44.1%
16
 
11.2%
Space Separator
ValueCountFrequency (%)
52456
> 99.9%
 4
 
< 0.1%
Initial Punctuation
ValueCountFrequency (%)
156
86.2%
25
 
13.8%
Final Punctuation
ValueCountFrequency (%)
976
85.9%
160
 
14.1%
Open Punctuation
ValueCountFrequency (%)
(206
99.5%
[1
 
0.5%
Close Punctuation
ValueCountFrequency (%)
)219
99.5%
]1
 
0.5%
Connector Punctuation
ValueCountFrequency (%)
_390
100.0%
Nonspacing Mark
ValueCountFrequency (%)
136
100.0%
Enclosing Mark
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 278005 | 77.5%
Common | 80729 | 22.5%
Inherited | 159 | < 0.1%
Katakana | 6 | < 0.1%
Hangul | 5 | < 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
52456
65.0%
@5426
 
6.7%
/4770
 
5.9%
.4488
 
5.6%
:1869
 
2.3%
,1306
 
1.6%
!979
 
1.2%
976
 
1.2%
1646
 
0.8%
0615
 
0.8%
Other values (206)7198
 
8.9%
Latin
ValueCountFrequency (%)
e28795
 
10.4%
t25478
 
9.2%
o21652
 
7.8%
a20818
 
7.5%
n17490
 
6.3%
i17350
 
6.2%
s16100
 
5.8%
r15898
 
5.7%
h11103
 
4.0%
l10615
 
3.8%
Other values (45)92706
33.3%
Hangul
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Inherited
ValueCountFrequency (%)
136
85.5%
16
 
10.1%
7
 
4.4%
Katakana
ValueCountFrequency (%)
6
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 356124 | 99.2%
Punctuation | 1520 | 0.4%
None | 693 | 0.2%
Emoticons | 137 | < 0.1%
VS | 136 | < 0.1%
Enclosed Alphanum Sup | 104 | < 0.1%
Dingbats | 89 | < 0.1%
Misc Symbols | 51 | < 0.1%
Latin 1 Sup | 34 | < 0.1%
Katakana | 6 | < 0.1%
Other values (6) | 10 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
52456
14.7%
e28795
 
8.1%
t25478
 
7.2%
o21652
 
6.1%
a20818
 
5.8%
n17490
 
4.9%
i17350
 
4.9%
s16100
 
4.5%
r15898
 
4.5%
h11103
 
3.1%
Other values (79)128984
36.2%
None
ValueCountFrequency (%)
🏼195
28.1%
👏143
20.6%
💯58
 
8.4%
👋27
 
3.9%
👇27
 
3.9%
👌16
 
2.3%
👀14
 
2.0%
🏻13
 
1.9%
🤔11
 
1.6%
💙10
 
1.4%
Other values (98)179
25.8%
Emoticons
ValueCountFrequency (%)
🙏99
72.3%
🙌10
 
7.3%
😐5
 
3.6%
😬4
 
2.9%
🙋4
 
2.9%
🙄3
 
2.2%
😉3
 
2.2%
😎2
 
1.5%
😮1
 
0.7%
😊1
 
0.7%
Other values (5)5
 
3.6%
Enclosed Alphanum Sup
ValueCountFrequency (%)
🇬17
16.3%
🇳16
15.4%
🆕13
12.5%
🇯8
7.7%
🇵7
 
6.7%
🇲6
 
5.8%
🇺5
 
4.8%
🇷5
 
4.8%
🇸4
 
3.8%
🇧4
 
3.8%
Other values (10)19
18.3%
Punctuation
ValueCountFrequency (%)
976
64.2%
160
 
10.5%
156
 
10.3%
64
 
4.2%
63
 
4.1%
45
 
3.0%
25
 
1.6%
16
 
1.1%
13
 
0.9%
1
 
0.1%
Misc Symbols
ValueCountFrequency (%)
27
52.9%
14
27.5%
4
 
7.8%
2
 
3.9%
1
 
2.0%
1
 
2.0%
1
 
2.0%
1
 
2.0%
VS
ValueCountFrequency (%)
136
100.0%
Latin 1 Sup
ValueCountFrequency (%)
¯12
35.3%
§7
20.6%
°4
 
11.8%
 4
 
11.8%
é3
 
8.8%
¡1
 
2.9%
í1
 
2.9%
ã1
 
2.9%
£1
 
2.9%
Dingbats
ValueCountFrequency (%)
81
91.0%
2
 
2.2%
1
 
1.1%
1
 
1.1%
1
 
1.1%
1
 
1.1%
1
 
1.1%
1
 
1.1%
Katakana
ValueCountFrequency (%)
6
100.0%
Misc Technical
ValueCountFrequency (%)
1
100.0%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%
Hangul
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Arrows
ValueCountFrequency (%)
1
100.0%
Currency Symbols
ValueCountFrequency (%)
1
100.0%
Geometric Shapes Ext
ValueCountFrequency (%)
🟩1
100.0%

username
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.4 KiB

Value | Count
jack | 1442

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 5768
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: jack
2nd row: jack
3rd row: jack
4th row: jack
5th row: jack

Common Values

Value | Count | Frequency (%)
jack | 1442 | 100.0%
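
A column is constant when every row holds the same value, which is why username carries the CONSTANT and REJECTED tags. A minimal sketch of such a check (the helper `is_constant` is illustrative, not the report's implementation):

```python
def is_constant(values):
    """True when every entry in the column holds the same value."""
    return len(set(values)) == 1

# The username column as described: 1442 rows, all "jack".
username = ["jack"] * 1442

print(is_constant(username))
# One distinct value over 1442 rows rounds to the reported 0.1%.
distinct_pct = round(100 * len(set(username)) / len(username), 1)
print(distinct_pct)
```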

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
jack | 1442 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
j | 1442 | 25.0%
a | 1442 | 25.0%
c | 1442 | 25.0%
k | 1442 | 25.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5768 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
j | 1442 | 25.0%
a | 1442 | 25.0%
c | 1442 | 25.0%
k | 1442 | 25.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5768 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
j | 1442 | 25.0%
a | 1442 | 25.0%
c | 1442 | 25.0%
k | 1442 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5768 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
j | 1442 | 25.0%
a | 1442 | 25.0%
c | 1442 | 25.0%
k | 1442 | 25.0%

mentions
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 17
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6463245492
Minimum: 0
Maximum: 53
Zeros: 992
Zeros (%): 68.8%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 53
Range: 53
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.093847025
Coefficient of variation (CV): 3.239621685
Kurtosis: 307.465678
Mean: 0.6463245492
Median Absolute Deviation (MAD): 0
Skewness: 14.33110573
Sum: 932
Variance: 4.384195364
Monotonicity: Not monotonic
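
Several of the descriptive statistics for mentions are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A quick consistency check using the reported figures:

```python
import math

n = 1442                 # Number of observations
total = 932              # Sum
std = 2.093847025        # Standard deviation

mean = total / n         # mean = sum / n
variance = std ** 2      # variance = std squared
cv = std / mean          # coefficient of variation = std / mean

print(mean, variance, cv)
```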
Histogram with fixed size bins (bins=17)

Common Values

Value | Count | Frequency (%)
0 | 992 | 68.8%
1 | 273 | 18.9%
2 | 96 | 6.7%
3 | 34 | 2.4%
4 | 20 | 1.4%
5 | 8 | 0.6%
8 | 4 | 0.3%
7 | 2 | 0.1%
11 | 2 | 0.1%
6 | 2 | 0.1%
Other values (7) | 9 | 0.6%

Extreme Values (minimum)

Value | Count | Frequency (%)
0 | 992 | 68.8%
1 | 273 | 18.9%
2 | 96 | 6.7%
3 | 34 | 2.4%
4 | 20 | 1.4%
5 | 8 | 0.6%
6 | 2 | 0.1%
7 | 2 | 0.1%
8 | 4 | 0.3%
9 | 2 | 0.1%

Extreme Values (maximum)

Value | Count | Frequency (%)
53 | 1 | 0.1%
31 | 1 | 0.1%
18 | 1 | 0.1%
13 | 1 | 0.1%
12 | 1 | 0.1%
11 | 2 | 0.1%
10 | 2 | 0.1%
9 | 2 | 0.1%
8 | 4 | 0.3%
7 | 2 | 0.1%

hashtags
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 8
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2871012483
Minimum: 0
Maximum: 42
Zeros: 1168
Zeros (%): 81.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 42
Range: 42
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.26368706
Coefficient of variation (CV): 4.40153802
Kurtosis: 825.3167325
Mean: 0.2871012483
Median Absolute Deviation (MAD): 0
Skewness: 25.37077893
Sum: 414
Variance: 1.596904985
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Common Values

Value | Count | Frequency (%)
0 | 1168 | 81.0%
1 | 202 | 14.0%
2 | 54 | 3.7%
3 | 9 | 0.6%
4 | 6 | 0.4%
5 | 1 | 0.1%
42 | 1 | 0.1%
6 | 1 | 0.1%

Extreme Values (minimum)

Value | Count | Frequency (%)
0 | 1168 | 81.0%
1 | 202 | 14.0%
2 | 54 | 3.7%
3 | 9 | 0.6%
4 | 6 | 0.4%
5 | 1 | 0.1%
6 | 1 | 0.1%
42 | 1 | 0.1%

Extreme Values (maximum)

Value | Count | Frequency (%)
42 | 1 | 0.1%
6 | 1 | 0.1%
5 | 1 | 0.1%
4 | 6 | 0.4%
3 | 9 | 0.6%
2 | 54 | 3.7%
1 | 202 | 14.0%
0 | 1168 | 81.0%
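
Because hashtags has only eight distinct values, its value counts describe the whole column, and the skewness that triggers the SKEWED warning can be recomputed from them. The sketch below assumes the report uses the adjusted Fisher-Pearson coefficient (the bias-corrected form that pandas computes for `skew`); the function name is illustrative.

```python
import math

# Value counts for hashtags as listed in the report: value -> count.
hashtag_counts = {0: 1168, 1: 202, 2: 54, 3: 9, 4: 6, 5: 1, 6: 1, 42: 1}

def adjusted_skewness(value_counts):
    """Adjusted Fisher-Pearson skewness G1 computed from value counts."""
    n = sum(value_counts.values())
    mean = sum(v * c for v, c in value_counts.items()) / n
    # Second and third central moments (population form).
    m2 = sum(c * (v - mean) ** 2 for v, c in value_counts.items()) / n
    m3 = sum(c * (v - mean) ** 3 for v, c in value_counts.items()) / n
    g1 = m3 / m2 ** 1.5
    # Small-sample correction factor.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

skew = adjusted_skewness(hashtag_counts)
print(round(skew, 4))
```

The single tweet with 42 hashtags dominates the third moment, which is why the coefficient is so large despite 81% of the rows being zero.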

cashtags
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 11.4 KiB

Value | Count
0 | 1437
1 | 4
2 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1442
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1437 | 99.7%
1 | 4 | 0.3%
2 | 1 | 0.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 1437 | 99.7%
1 | 4 | 0.3%
2 | 1 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 1437 | 99.7%
1 | 4 | 0.3%
2 | 1 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1442 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1437 | 99.7%
1 | 4 | 0.3%
2 | 1 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1442 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1437 | 99.7%
1 | 4 | 0.3%
2 | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1442 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1437 | 99.7%
1 | 4 | 0.3%
2 | 1 | 0.1%

video
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 8
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3210818308
Minimum: 0
Maximum: 7
Zeros: 1130
Zeros (%): 78.4%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7681819305
Coefficient of variation (CV): 2.392480224
Kurtosis: 18.77912957
Mean: 0.3210818308
Median Absolute Deviation (MAD): 0
Skewness: 3.694846762
Sum: 463
Variance: 0.5901034784
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Common Values

Value | Count | Frequency (%)
0 | 1130 | 78.4%
1 | 226 | 15.7%
2 | 51 | 3.5%
3 | 19 | 1.3%
4 | 8 | 0.6%
5 | 4 | 0.3%
6 | 2 | 0.1%
7 | 2 | 0.1%

Extreme Values (minimum)

Value | Count | Frequency (%)
0 | 1130 | 78.4%
1 | 226 | 15.7%
2 | 51 | 3.5%
3 | 19 | 1.3%
4 | 8 | 0.6%
5 | 4 | 0.3%
6 | 2 | 0.1%
7 | 2 | 0.1%

Extreme Values (maximum)

Value | Count | Frequency (%)
7 | 2 | 0.1%
6 | 2 | 0.1%
5 | 4 | 0.3%
4 | 8 | 0.6%
3 | 19 | 1.3%
2 | 51 | 3.5%
1 | 226 | 15.7%
0 | 1130 | 78.4%

photos
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 12
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3765603329
Minimum: 0
Maximum: 13
Zeros: 1134
Zeros (%): 78.6%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 13
Range: 13
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.057187731
Coefficient of variation (CV): 2.80748565
Kurtosis: 47.09885571
Mean: 0.3765603329
Median Absolute Deviation (MAD): 0
Skewness: 5.693583242
Sum: 543
Variance: 1.117645898
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Common Values

Value | Count | Frequency (%)
0 | 1134 | 78.6%
1 | 208 | 14.4%
2 | 48 | 3.3%
3 | 24 | 1.7%
4 | 14 | 1.0%
7 | 4 | 0.3%
5 | 3 | 0.2%
8 | 2 | 0.1%
13 | 2 | 0.1%
6 | 1 | 0.1%
Other values (2) | 2 | 0.1%

Extreme Values (minimum)

Value | Count | Frequency (%)
0 | 1134 | 78.6%
1 | 208 | 14.4%
2 | 48 | 3.3%
3 | 24 | 1.7%
4 | 14 | 1.0%
5 | 3 | 0.2%
6 | 1 | 0.1%
7 | 4 | 0.3%
8 | 2 | 0.1%
9 | 1 | 0.1%

Extreme Values (maximum)

Value | Count | Frequency (%)
13 | 2 | 0.1%
11 | 1 | 0.1%
9 | 1 | 0.1%
8 | 2 | 0.1%
7 | 4 | 0.3%
6 | 1 | 0.1%
5 | 3 | 0.2%
4 | 14 | 1.0%
3 | 24 | 1.7%
2 | 48 | 3.3%

urls
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 11
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7593619972
Minimum: 0
Maximum: 16
Zeros: 751
Zeros (%): 52.1%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 16
Range: 16
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.152343842
Coefficient of variation (CV): 1.517515818
Kurtosis: 28.86571183
Mean: 0.7593619972
Median Absolute Deviation (MAD): 0
Skewness: 3.68603954
Sum: 1095
Variance: 1.327896331
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common Values

Value | Count | Frequency (%)
0 | 751 | 52.1%
1 | 463 | 32.1%
2 | 142 | 9.8%
3 | 48 | 3.3%
4 | 15 | 1.0%
5 | 14 | 1.0%
6 | 3 | 0.2%
7 | 3 | 0.2%
16 | 1 | 0.1%
10 | 1 | 0.1%

Extreme Values (minimum)

Value | Count | Frequency (%)
0 | 751 | 52.1%
1 | 463 | 32.1%
2 | 142 | 9.8%
3 | 48 | 3.3%
4 | 15 | 1.0%
5 | 14 | 1.0%
6 | 3 | 0.2%
7 | 3 | 0.2%
9 | 1 | 0.1%
10 | 1 | 0.1%

Extreme Values (maximum)

Value | Count | Frequency (%)
16 | 1 | 0.1%
10 | 1 | 0.1%
9 | 1 | 0.1%
7 | 3 | 0.2%
6 | 3 | 0.2%
5 | 14 | 1.0%
4 | 15 | 1.0%
3 | 48 | 3.3%
2 | 142 | 9.8%
1 | 463 | 32.1%

replies_count
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 533
Distinct (%): 37.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 549.4036061
Minimum: 0
Maximum: 68375
Zeros: 4
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 27
median: 81
Q3: 243.75
95-th percentile: 1562.5
Maximum: 68375
Range: 68375
Interquartile range (IQR): 216.75
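
The interquartile range is Q3 minus Q1, and fractional quartiles such as 243.75 on integer counts are what linear interpolation between neighbouring order statistics produces (the default behaviour of pandas' `quantile`). The sketch below shows both; the `quantile` helper is a plain-Python illustration on made-up data, not the report's code.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

def iqr(q1, q3):
    """Interquartile range: distance between the first and third quartile."""
    return q3 - q1

# Reported quartiles for replies_count.
replies_iqr = iqr(27, 243.75)

# Illustrative data only (not taken from the dataset).
example_q3 = quantile([1, 2, 3, 4], 0.75)
print(replies_iqr, example_q3)
```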

Descriptive statistics

Standard deviation: 3033.749321
Coefficient of variation (CV): 5.521895538
Kurtosis: 244.7361389
Mean: 549.4036061
Median Absolute Deviation (MAD): 68
Skewness: 14.16221557
Sum: 792240
Variance: 9203634.942
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
3 | 24 | 1.7%
8 | 23 | 1.6%
6 | 22 | 1.5%
9 | 21 | 1.5%
11 | 20 | 1.4%
4 | 20 | 1.4%
5 | 18 | 1.2%
12 | 16 | 1.1%
19 | 14 | 1.0%
10 | 14 | 1.0%
Other values (523) | 1250 | 86.7%

Extreme Values (minimum)

Value | Count | Frequency (%)
0 | 4 | 0.3%
1 | 8 | 0.6%
2 | 14 | 1.0%
3 | 24 | 1.7%
4 | 20 | 1.4%
5 | 18 | 1.2%
6 | 22 | 1.5%
7 | 12 | 0.8%
8 | 23 | 1.6%
9 | 21 | 1.5%

Extreme Values (maximum)

Value | Count | Frequency (%)
68375 | 1 | 0.1%
43568 | 1 | 0.1%
38045 | 1 | 0.1%
35556 | 1 | 0.1%
31458 | 1 | 0.1%
27566 | 1 | 0.1%
22342 | 1 | 0.1%
21813 | 1 | 0.1%
19264 | 1 | 0.1%
12851 | 1 | 0.1%

retweets_count
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 609
Distinct (%): 42.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1173.194175
Minimum: 0
Maximum: 151587
Zeros: 42
Zeros (%): 2.9%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 27
median: 113
Q3: 366.75
95-th percentile: 3147.85
Maximum: 151587
Range: 151587
Interquartile range (IQR): 339.75

Descriptive statistics

Standard deviation: 8060.152803
Coefficient of variation (CV): 6.870263233
Kurtosis: 231.2194763
Mean: 1173.194175
Median Absolute Deviation (MAD): 107
Skewness: 14.34012873
Sum: 1691746
Variance: 64966063.22
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
2 | 47 | 3.3%
1 | 45 | 3.1%
0 | 42 | 2.9%
3 | 34 | 2.4%
4 | 28 | 1.9%
5 | 20 | 1.4%
6 | 14 | 1.0%
8 | 13 | 0.9%
27 | 10 | 0.7%
22 | 10 | 0.7%
Other values (599) | 1179 | 81.8%

Extreme Values (minimum)

Value | Count | Frequency (%)
0 | 42 | 2.9%
1 | 45 | 3.1%
2 | 47 | 3.3%
3 | 34 | 2.4%
4 | 28 | 1.9%
5 | 20 | 1.4%
6 | 14 | 1.0%
7 | 8 | 0.6%
8 | 13 | 0.9%
9 | 8 | 0.6%

Extreme Values (maximum)

Value | Count | Frequency (%)
151587 | 1 | 0.1%
143448 | 1 | 0.1%
137959 | 1 | 0.1%
90853 | 1 | 0.1%
82384 | 1 | 0.1%
71494 | 1 | 0.1%
53941 | 1 | 0.1%
40647 | 1 | 0.1%
36622 | 1 | 0.1%
32642 | 1 | 0.1%

likes_count
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1095
Distinct (%): 75.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5022.89251
Minimum: 1
Maximum: 778372
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 20
Q1: 217
median: 661
Q3: 2069.5
95-th percentile: 18770.2
Maximum: 778372
Range: 778371
Interquartile range (IQR): 1852.5

Descriptive statistics

Standard deviation: 28776.07576
Coefficient of variation (CV): 5.728984982
Kurtosis: 419.6634652
Mean: 5022.89251
Median Absolute Deviation (MAD): 580.5
Skewness: 18.2259457
Sum: 7243011
Variance: 828062536
Monotonicity: Not monotonic
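
The Median Absolute Deviation is a robust spread measure: the median of each value's absolute distance from the median. For likes_count it is 580.5, far below the standard deviation of about 28776, which reflects how strongly a few very popular tweets inflate the latter. A minimal sketch, assuming the unscaled definition (no consistency constant applied) and using made-up data:

```python
from statistics import median

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    center = median(values)
    return median(abs(v - center) for v in values)

# Illustrative data only (not taken from the dataset): one large outlier
# barely moves the MAD.
example = [1, 2, 4, 7, 100]
mad = median_absolute_deviation(example)
print(mad)
```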
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
17 | 7 | 0.5%
14 | 7 | 0.5%
11 | 7 | 0.5%
16 | 6 | 0.4%
19 | 5 | 0.3%
6 | 5 | 0.3%
59 | 5 | 0.3%
35 | 5 | 0.3%
9 | 5 | 0.3%
20 | 5 | 0.3%
Other values (1085) | 1385 | 96.0%

Extreme Values (minimum)

Value | Count | Frequency (%)
1 | 2 | 0.1%
3 | 2 | 0.1%
4 | 3 | 0.2%
5 | 2 | 0.1%
6 | 5 | 0.3%
7 | 4 | 0.3%
8 | 3 | 0.2%
9 | 5 | 0.3%
10 | 3 | 0.2%
11 | 7 | 0.5%

Extreme Values (maximum)

Value | Count | Frequency (%)
778372 | 1 | 0.1%
451099 | 1 | 0.1%
311529 | 1 | 0.1%
293496 | 1 | 0.1%
193073 | 1 | 0.1%
161974 | 1 | 0.1%
141080 | 1 | 0.1%
115108 | 1 | 0.1%
111972 | 1 | 0.1%
104797 | 1 | 0.1%

number of tweets
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 36
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.441054092
Minimum: 1
Maximum: 162
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 11
Maximum: 162
Range: 161
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 7.382552043
Coefficient of variation (CV): 2.145433302
Kurtosis: 202.5780047
Mean: 3.441054092
Median Absolute Deviation (MAD): 1
Skewness: 11.92597833
Sum: 4962
Variance: 54.50207467
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)

Common Values

Value | Count | Frequency (%)
1 | 646 | 44.8%
2 | 306 | 21.2%
3 | 151 | 10.5%
4 | 97 | 6.7%
5 | 54 | 3.7%
6 | 37 | 2.6%
7 | 31 | 2.1%
8 | 19 | 1.3%
10 | 15 | 1.0%
11 | 13 | 0.9%
Other values (26) | 73 | 5.1%

Extreme Values (minimum)

Value | Count | Frequency (%)
1 | 646 | 44.8%
2 | 306 | 21.2%
3 | 151 | 10.5%
4 | 97 | 6.7%
5 | 54 | 3.7%
6 | 37 | 2.6%
7 | 31 | 2.1%
8 | 19 | 1.3%
9 | 10 | 0.7%
10 | 15 | 1.0%

Extreme Values (maximum)

Value | Count | Frequency (%)
162 | 1 | 0.1%
111 | 1 | 0.1%
97 | 1 | 0.1%
57 | 1 | 0.1%
56 | 1 | 0.1%
46 | 1 | 0.1%
43 | 2 | 0.1%
40 | 2 | 0.1%
38 | 1 | 0.1%
37 | 1 | 0.1%

price
Real number (ℝ≥0)

Distinct: 1211
Distinct (%): 84.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.50781671
Minimum: 14.30000019
Maximum: 76.87000275
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 11.4 KiB

Quantile statistics

Minimum: 14.30000019
5-th percentile: 16.00150003
Q1: 19.83999968
median: 31.23666668
Q3: 36.25749874
95-th percentile: 57.18533293
Maximum: 76.87000275
Range: 62.57000256
Interquartile range (IQR): 16.41749907

Descriptive statistics

Standard deviation: 12.4813959
Coefficient of variation (CV): 0.3961364895
Kurtosis: 1.34750385
Mean: 31.50781671
Median Absolute Deviation (MAD): 7.241666794
Skewness: 1.042551359
Sum: 45434.2717
Variance: 155.7852437
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
18.14999962 | 5 | 0.3%
18.45000076 | 5 | 0.3%
34.75 | 5 | 0.3%
31.90999985 | 4 | 0.3%
30.5 | 4 | 0.3%
18 | 4 | 0.3%
32.86999893 | 4 | 0.3%
29.85000038 | 4 | 0.3%
16.95000076 | 3 | 0.2%
32.84999847 | 3 | 0.2%
Other values (1201) | 1401 | 97.2%

Extreme Values (minimum)

Value | Count | Frequency (%)
14.30000019 | 2 | 0.1%
14.31000042 | 1 | 0.1%
14.31333319 | 1 | 0.1%
14.32000001 | 1 | 0.1%
14.33999983 | 1 | 0.1%
14.34000015 | 1 | 0.1%
14.35999966 | 2 | 0.1%
14.39000034 | 1 | 0.1%
14.42000008 | 1 | 0.1%
14.43999958 | 1 | 0.1%

Extreme Values (maximum)

Value | Count | Frequency (%)
76.87000275 | 1 | 0.1%
76.61000061 | 1 | 0.1%
73.16999817 | 1 | 0.1%
73.09999847 | 1 | 0.1%
73.05000305 | 1 | 0.1%
72.5566686 | 1 | 0.1%
72.51000214 | 1 | 0.1%
72.27999878 | 1 | 0.1%
71.06999969 | 1 | 0.1%
70.86000061 | 1 | 0.1%

percent change
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 1413
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.001164469108
Minimum: -0.1795004093
Maximum: 0.2690450286
Zeros: 27
Zeros (%): 1.9%
Negative: 650
Negative (%): 45.1%
Memory size: 11.4 KiB

Quantile statistics

Minimum: -0.1795004093
5-th percentile: -0.02633335287
Q1: -0.007303437164
median: 0.000839321342
Q3: 0.009306136269
95-th percentile: 0.03010720506
Maximum: 0.2690450286
Range: 0.4485454379
Interquartile range (IQR): 0.01660957343

Descriptive statistics

Standard deviation: 0.0224419658
Coefficient of variation (CV): 19.27227235
Kurtosis: 23.71171879
Mean: 0.001164469108
Median Absolute Deviation (MAD): 0.008339942576
Skewness: 0.7783398446
Sum: 1.679164454
Variance: 0.0005036418288
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 27 | 1.9%
0.002958647665 | 2 | 0.1%
-0.001601317623 | 2 | 0.1%
0.01182736634 | 2 | 0.1%
-0.02925527975 | 1 | 0.1%
-0.01400108956 | 1 | 0.1%
0.02874969905 | 1 | 0.1%
0.00765233064 | 1 | 0.1%
-0.001926878778 | 1 | 0.1%
0.01108037515 | 1 | 0.1%
Other values (1403) | 1403 | 97.3%

Extreme Values (minimum)

Value | Count | Frequency (%)
-0.1795004093 | 1 | 0.1%
-0.132510452 | 1 | 0.1%
-0.1233974119 | 1 | 0.1%
-0.1186813388 | 1 | 0.1%
-0.1162518607 | 1 | 0.1%
-0.1147373753 | 1 | 0.1%
-0.1012580307 | 1 | 0.1%
-0.0875292677 | 1 | 0.1%
-0.08402687431 | 1 | 0.1%
-0.08248886624 | 1 | 0.1%

Extreme Values (maximum)

Value | Count | Frequency (%)
0.2690450286 | 1 | 0.1%
0.1540526553 | 1 | 0.1%
0.1372548531 | 1 | 0.1%
0.1213535588 | 1 | 0.1%
0.09404011522 | 1 | 0.1%
0.09031622248 | 1 | 0.1%
0.08396610627 | 1 | 0.1%
0.07433101969 | 1 | 0.1%
0.07251031929 | 1 | 0.1%
0.07047191572 | 1 | 0.1%

bins
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Value | Count
no change | 1208
rise | 129
drop | 105

Length

Max length: 9
Median length: 9
Mean length: 8.188626907
Min length: 4

Characters and Unicode

Total characters: 11808
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
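
The length and character totals for bins follow directly from the three category counts. A short check, using the counts reported for this variable:

```python
# Category counts for bins as reported.
bin_counts = {"no change": 1208, "rise": 129, "drop": 105}

n_rows = sum(bin_counts.values())
# Each label contributes its length once per row in which it appears.
total_chars = sum(len(label) * n for label, n in bin_counts.items())
mean_length = total_chars / n_rows
# Distinct characters across the three labels (the space counts as one).
distinct_chars = set("".join(bin_counts))

print(n_rows, total_chars, mean_length, len(distinct_chars))
```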

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: drop
2nd row: no change
3rd row: no change
4th row: no change
5th row: no change

Common Values

Value | Count | Frequency (%)
no change | 1208 | 83.8%
rise | 129 | 8.9%
drop | 105 | 7.3%
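
The report flags bins as highly correlated with percent change, which suggests the labels were derived by thresholding that column. The report does not state the rule, so the sketch below is purely an assumption: the symmetric cut-off `THRESHOLD = 0.02` and the function `to_bin` are hypothetical and shown only to illustrate the idea.

```python
# Hypothetical cut-off: the report does not state how bins was derived.
THRESHOLD = 0.02

def to_bin(percent_change, threshold=THRESHOLD):
    """Label a percent change as 'rise', 'drop', or 'no change'."""
    if percent_change > threshold:
        return "rise"
    if percent_change < -threshold:
        return "drop"
    return "no change"

# Maximum, minimum, and median of percent change as reported.
labels = [to_bin(x) for x in (0.2690450286, -0.1795004093, 0.000839321342)]
print(labels)
```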

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
no | 1208 | 45.6%
change | 1208 | 45.6%
rise | 129 | 4.9%
drop | 105 | 4.0%

Most occurring characters

Value | Count | Frequency (%)
n | 2416 | 20.5%
e | 1337 | 11.3%
o | 1313 | 11.1%
(space) | 1208 | 10.2%
c | 1208 | 10.2%
h | 1208 | 10.2%
a | 1208 | 10.2%
g | 1208 | 10.2%
r | 234 | 2.0%
i | 129 | 1.1%
Other values (3) | 339 | 2.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 10600 | 89.8%
Space Separator | 1208 | 10.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 2416 | 22.8%
e | 1337 | 12.6%
o | 1313 | 12.4%
c | 1208 | 11.4%
h | 1208 | 11.4%
a | 1208 | 11.4%
g | 1208 | 11.4%
r | 234 | 2.2%
i | 129 | 1.2%
s | 129 | 1.2%
Other values (2) | 210 | 2.0%
Space Separator
Value | Count | Frequency (%)
(space) | 1208 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 10600 | 89.8%
Common | 1208 | 10.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 2416 | 22.8%
e | 1337 | 12.6%
o | 1313 | 12.4%
c | 1208 | 11.4%
h | 1208 | 11.4%
a | 1208 | 11.4%
g | 1208 | 11.4%
r | 234 | 2.2%
i | 129 | 1.2%
s | 129 | 1.2%
Other values (2) | 210 | 2.0%
Common
Value | Count | Frequency (%)
(space) | 1208 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11808 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 2416 | 20.5%
e | 1337 | 11.3%
o | 1313 | 11.1%
(space) | 1208 | 10.2%
c | 1208 | 10.2%
h | 1208 | 10.2%
a | 1208 | 10.2%
g | 1208 | 10.2%
r | 234 | 2.0%
i | 129 | 1.1%
Other values (3) | 339 | 2.9%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis (the slope) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
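
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the code the report generator runs; the function name `pearson_r` and the sample values are assumptions made up for the example and are not taken from the dataset.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if std_x == 0 or std_y == 0:
        # A constant variable has zero spread, so r is undefined.
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)

# Hypothetical values: y is an exact linear function of x.
x = [1, 3, 5, 7, 9]
y = [2 * a + 10 for a in x]

r = pearson_r(x, y)
# Shifting and rescaling x (location and scale change) leaves r unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], y)
# Flipping the sign of y flips the sign of r.
r_negated = pearson_r(x, [-b for b in y])
```

Because the same divisor (n) is used for the covariance and for both standard deviations, it cancels out, so using n - 1 throughout would give the same r.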

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
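
As a rough illustration of the rank-based calculation above, the sketch below ranks each variable (tied values share the average of their positions, which is an assumption about tie handling) and then applies the Pearson formula to the ranks. The helper names and sample values are hypothetical.

```python
import math

def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared_rank
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical values: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)  # 1 up to floating-point rounding
r = pearson_r(x, y)       # below 1 because the relationship is not linear
```

The example shows the point made in the text: a perfectly monotonic but nonlinear relationship gives ρ of 1 while Pearson's r stays below 1.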

Kendall's τ

Similar to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
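
The pair-counting definition above can be written out directly. The sketch below implements exactly that definition (often called tau-a); it is an illustration under that assumption, and library implementations frequently apply a tie correction (tau-b), so results can differ when the data contain ties. The function name and sample values are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    total_pairs = 0
    for i, j in combinations(range(len(x)), 2):
        total_pairs += 1
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / total_pairs

# Hypothetical values: one adjacent swap in y produces a single discordant pair.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau_a(x, y)  # 5 concordant, 1 discordant, 6 pairs in total
```

This brute-force version compares every pair, so it scales quadratically with the number of observations; that is fine for a sketch but not for large inputs.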

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
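
A minimal sketch of a bias-corrected Cramér's V follows, assuming the commonly cited form of Bergsma's correction: φ² = χ²/n is reduced by (k-1)(r-1)/(n-1), and the row and column counts r and k are shrunk in the same spirit. The function name and the category labels are hypothetical, and this is not necessarily the exact code path the report generator uses.

```python
import math
from collections import Counter

def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("a and b must have the same length (at least 2)")
    row_totals = Counter(a)
    col_totals = Counter(b)
    observed = Counter(zip(a, b))
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row, row_count in row_totals.items():
        for col, col_count in col_totals.items():
            expected = row_count * col_count / n
            chi2 += (observed[(row, col)] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        # Degenerate case, e.g. one variable is constant.
        return 0.0
    return math.sqrt(phi2_corr / denom)

# Hypothetical labels: b is fully determined by a (perfect association).
a = ["u", "v", "w"] * 20
b = ["p", "q", "s"] * 20
v_perfect = cramers_v_corrected(a, b)

# Hypothetical labels: every combination occurs equally often (independence).
c = ["u", "u", "v", "v"] * 10
d = ["p", "q", "p", "q"] * 10
v_independent = cramers_v_corrected(c, d)
```

Clamping the corrected φ² at zero keeps the result inside the 0 to 1 range quoted in the text even when sampling noise would otherwise push it negative.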

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.

Sample

First rows

date  tweet  username  mentions  hashtags  cashtags  video  photos  urls  replies_count  retweets_count  likes_count  number of tweets  price  percent change  bins
02016-08-24 16:00:00@dangillmor the process is not as clear as it could be. That's on us. We're taking all the feedback to make it better!jack00000014317118.250000-0.029255drop
12016-08-25 09:30:00@Stammy @Square @niw @RuaneFootball @martucci @cjburrows @thecleanmachine @goyal_soniya @yuriyr @pon_brandon @LindaxGuo 💯jack90000016126118.3300000.004384no change
22016-08-29 16:00:00@Jayanta and happy birthday! 🙌🏼🙌🏼🙌🏼 https://t.co/xbd8inh0Ky Power your TouchBistro or Vend point of sale with Square payments and hardware! https://t.co/VYBWJtSyhRjack1000023783270318.4699990.004350no change
32016-08-31 09:30:00💬💸 https://t.co/fBhEDusY7ajack0000012461168118.3899990.000544no change
42016-09-01 16:00:00"Hey Siri, send John $10" https://t.co/0gbGP0jpzdjack00000124115250119.5000000.006711no change
52016-09-02 09:30:00YES 🙏🏼🙏🏼🙏🏼 https://t.co/N3qwCQXbIx https://t.co/53EONK167qjack0001112024118119.6200010.006154no change
62016-09-07 09:30:00@IStandWithAhmed @JesseDorogusker great meeting you Ahmed!jack00000091155120.0499990.006021no change
72016-09-08 09:30:00🆕 from @120Sports! https://t.co/cdLhHr14uKjack100001174685118.809999-0.053347drop
82016-09-10 16:00:00Watch 🏈 LIVE on Twitter #AFvsGSU https://t.co/Ee7xEKAotX “Deskbound” by Kelly Starrett https://t.co/CyhN1tfme2jack01000251103236218.123334-0.007122no change
92016-09-12 16:00:00Happy Birthday @adambain! 🎂👟🏀jack1000003030301118.1500000.010579no change

Last rows

date  tweet  username  mentions  hashtags  cashtags  video  photos  urls  replies_count  retweets_count  likes_count  number of tweets  price  percent change  bins
14322021-07-09 09:30:00@JonLetchford Looking at this @papiflorida No @0xAfterburner That’s the plan @BrokenUniscorn That’s the plan @deyonte_btc That’s the plan We’re doing it #Bitcoin Mike was an incredible light for everyone around him. Grateful for the time we had. And a reminder to express it more while we’re together. 💔jack0100006619426596767.0700000.003591no change
14332021-07-09 16:00:00@elirousso @CashApp Thank you!jack0000003412306168.9700010.028329rise
14342021-07-14 16:00:00@nic__carter Nah 👋🏼jack0000001613202932017270.269997-0.002130no change
14352021-07-16 09:30:00Ok we made a twitter: @tbd54566975 @bits_infinity @b_buzzkill Was hoping you’d know @nahaarisnotrust $SQ @donovan_hat 🤔 @abbasi_z Too Bad Dude @themstems To Be Dynamic @Gardner ¯\_(ツ)_/¯ This @b_buzzkill Why wouldn’t you @Jac0van It is TBD @stephanlivera This is the (only) way @wongmjane Much better We’ll set up Twitter and github accounts soon and update this thread on where to find them. How is this different from @SqCrypto? Square doesn’t give direction to @SqCrypto, only funding. They chose to work on LDK, and are doing an incredible job! TBD will be focused on creating a platform business, and will open source our work along the way. Like our new #Bitcoin hardware wallet, we’re going to do this completely in the open. Open roadmap, open development, and open source. @brockm is leading and building this team, and we have some ideas around the initial platform primitives we want to build. Square is creating a new business (joining Seller, Cash App, & Tidal) focused on building an open developer platform with the sole goal of making it easy to create non-custodial, permissionless, and decentralized financial services. Our primary focus is #Bitcoin. Its name is TBD.jack42100034518224591871668.5599980.007198no change
14362021-07-16 16:00:00@BustaRhymes https://t.co/FjiaS84QH2 @m_tmkns 🙏🏼jack000001972742301266.410004-0.031359drop
14372021-07-17 09:30:00https://t.co/L49eiWThl9jack0000012823121732167.4533310.015710no change
14382021-07-17 16:00:00❤️ https://t.co/c4yEfiys5vjack0000012941771600166.280001-0.017395no change
14392021-07-18 16:00:00@MikeTyson https://t.co/0j3aZAXcwpjack0000012776643166.149999-0.002964no change
14402021-07-19 16:00:00@elonmusk @BitcoinMagazine @CathieDWood Can I borrow a wig?jack0000003993789588166.0199970.011956no change
14412021-07-20 09:30:00Square Banking is live! Checking, savings, debit card for small businesses: https://t.co/FacoHOogPd https://t.co/Sip86oI6fU @elonmusk @BitcoinMagazine @CathieDWood Or we go with this one instead: https://t.co/c6yN8q09hAjack0001126658477807366.2500000.003484no change